(57) Abstract: A method for converting a quantity of input data to output data of lesser dimension. A linear principal component
analysis technique is applied to the input data to generate a plurality of linear principal components and an error signal. The error
signal is input to a first neural network which outputs at least one variable, the said at least one variable is input to a second neural
network and the first and second neural networks are configured such that the output of the second neural network is substantially
equal to the input to the first neural network. The output data is represented by the said plurality of linear principal components and 
the said at least one variable. There is also provided a method of dynamically modelling a paper manufacturing plant comprising 
derivation of a function which takes as input a plurality of parameters of said paper manufacturing plant which have been reduced in 
dimension and outputs a value indicative of a quality of paper output from the paper manufacturing plant. 
A METHOD OF DYNAMICALLY MODELLING A PAPER MANUFACTURING PLANT USING PCA (PRINCIPAL COMPONENT ANALYSIS)

The present invention relates to a method for converting a quantity of input data to 
output data of lesser dimension. The present invention also relates to a method of
modelling a paper manufacturing plant so as to allow optimisation of various 
parameters within the manufacturing process. 

A paper manufacturing plant comprises two distinct phases - a first wet end and a
second dry end. The present invention is concerned with parameters of the wet end
process, which affect both phases of the paper manufacturing plant.

The wet end process involves input of fibres, water and chemicals. The wet end 
process comprises a number of complex processes including considerations of fluid 
dynamics, chemical reactions and physical reactions. This multi-input process is 
further complicated by various machine operating considerations such as speed of 
operations, chemical inputs and the particular configuration of the paper 
manufacturing machine.

Currently, systems exist which allow monitoring of wet end inputs and allow 
alteration of these inputs in response to obtained measurements. One such system is a
closed loop pH control system. This system measures pH within the head box and
adjusts one or more inputs so as to ensure that the pH remains within a predetermined
range in order to produce paper with predetermined properties. This closed loop
system is not currently in widespread use. Additionally, the system provides no 
indication of likely changes in output qualities in response to these input changes. 

Currently wet end parameters are adjusted according to an operator's individual
experience and expertise using a "best guess" approach. While experienced operators
may achieve good results using this method, it is likely that those with less experience
will be unable to obtain satisfactory settings of these parameters, thereby yielding
inconsistent results. Additionally, relying on such human intuition is never likely to
result in optimum paper manufacture parameters, no matter how much experience the
operator may possess.

Recently, a wet end information centre (WIC) system has been employed in some 
paper manufacture plants in order to obtain a wide variety of data relating to various 
operating parameters of the wet end paper manufacture process. Additionally, data 
relating to various user specified parameters such as speed is readily available. 

Despite having this large quantity of data available, use of this data as a basis for a 
model of the paper manufacturing process has not heretofore been considered. Such a 
model would ideally allow prediction of predetermined output characteristics in 
response to changes in predetermined input characteristics. However, given the 
quantity of input data and the complex permutations of this data that must be
considered to derive optimum parameter values, a model cannot be created and 
optimised using available computing power. 

It is an object of the present invention to obviate or mitigate one or more of the 
problems outlined above. 

According to a first aspect of the present invention there is provided a method for 
converting a quantity of input data to output data of lesser dimension, comprising: 

applying a linear principal component analysis technique to the input data to
generate a plurality of linear principal components and an error signal;

inputting said error signal to a first neural network which outputs at least one
variable;

inputting the said at least one variable to a second neural network; and

configuring the first and second neural networks such that the output of the
second neural network is substantially equal to the input to the first neural network;

the output being represented by the said plurality of linear principal
components and the said at least one variable.

Preferably, the first and second neural networks are trained until the difference
between the output of the second neural network and the input to the first neural 
network is less than a predetermined threshold. Furthermore, neurons may be added 
to the first and second neural networks until the difference between the output of the 
second neural network and the input to the first neural network is less than a 
predetermined threshold. A difference between the input to the first neural network 
and output of the second neural network may be assessed using an auto correlation 
technique. 

Preferably, the first and second neural networks each comprise an input layer having 
connections with a hidden layer which in turn has connections with an output layer,
such that the output layer of the first neural network forms the input layer of the 
second neural network. Preferably, the hidden layer of each neural network has more 
neurons than the input layer of the first neural network, and more preferably, the 
hidden layer of each neural network has twice as many neurons as the input layer of 
the first neural network. The input layer of the first neural network may have an equal
number of neurons to the output layer of the second neural network. 

The method may further comprise using the output data as input to a model of a 
process, such that the model outputs at least one value representing a property of that 
process, comparing the value representing the said property output from the model 
with a measured value for that property, and training the model so as to adjust 
weights of nodes within the model until the difference between the value for the 
property output from the model and the measured value of the property is less than a 
predetermined threshold. The model may be implemented by means of a neural 
network. 

According to a second aspect of the present invention, there is provided a method of 
dynamically modelling a paper manufacturing plant comprising derivation of a 
function which takes as input a plurality of parameters of said paper manufacturing 
plant which have been reduced in dimension and outputs a value indicative of a 
quality of paper output from the paper manufacturing plant.

Preferably, the plurality of parameters comprise a plurality of temporally spaced
values for at least one parameter of the paper manufacturing plant. The function may 
also take as input values previously output from the function. The function may be 
provided by an artificial neural network. 

At least some of the plurality of parameters may be output from a method according 
to the first aspect of the present invention. 

The model may have a general form: 

where: R represents a quality of the output paper;
k is a current sample;
k-1 is a previous sample;
$t_1, \ldots, t_4$ are parameters of the paper manufacturing plant;
U(k-d) is a variable related to chemical costs;
d is time delay; and
l is the time period over which previous measurements are to be taken into
account.

A plurality of models are preferably generated and combined to generate a 
performance function. Optimisation of the performance function may seek to
maximise at least one of the outputs of the plurality of models. Optimisation of the 
performance function may seek to focus at least one of the outputs of the plurality of 
models upon a predetermined target. 

The performance function may have a form:

$$J = \frac{a}{A} + \frac{b}{B} + C$$

where: J is an output from the function;

A and B are outputs from models generated using a method according to any
one of claims 1 to 10;
C is a variable representing all adjustable inputs that are to be minimised; and
a and b are constants.

Alternatively, the performance function may have a form: 

$$J = a(A - T)^2 + \frac{b}{B} + C$$

where: J is an output from the function; 

A and B are outputs from models generated using a method according to any 
one of claims 1 to 10; 

T is a target upon which A is to be focussed; and
a and b are constants. 

Alternatively, the performance function may have a form: 

$$J = a(A - T)^2 + b(B - T_2)^2 + C$$

where: J is an output from the function; 

A and B are outputs from models generated using a method according to any 
one of claims 1 to 10; 

$T$ and $T_2$ are target values for A and B respectively; and
a and b are constants. 
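
The three forms of performance function described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the default values of the constants a and b are hypothetical, and the first form is assumed to be a/A + b/B + C, a reading based on the stated aim of maximising the model outputs while minimising the adjustable inputs.

```python
def performance_reciprocal(A, B, C, a=1.0, b=1.0):
    """Assumed first form: J = a/A + b/B + C (J falls as A and B grow)."""
    return a / A + b / B + C


def performance_one_target(A, B, C, T, a=1.0, b=1.0):
    """Second form: J = a*(A - T)**2 + b/B + C (A is focussed on target T)."""
    return a * (A - T) ** 2 + b / B + C


def performance_two_targets(A, B, C, T, T2, a=1.0, b=1.0):
    """Third form: J = a*(A - T)**2 + b*(B - T2)**2 + C."""
    return a * (A - T) ** 2 + b * (B - T2) ** 2 + C
```

In each case a smaller J is better, so an optimiser would search for the adjustable inputs that minimise the chosen function.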

Any of the performance functions detailed above may include at least one other term.

An embodiment of the present invention will now be described, by way of example,
with reference to the accompanying drawings, in which:

Figure 1 is a schematic illustration of a known paper manufacturing plant; 

Figure 2 is a schematic illustration of inputs to and outputs from the wet end paper 
manufacturing process operated in the plant of figure 1; 



Figure 3 is a flow chart of a method in accordance with the present invention; 



WO 03/074784 

PCT/GB03/00943 


Figure 4 is a schematic illustration of two Artificial Neural Networks used in a
preferred embodiment of the present invention; and

Figure 5 is a graph showing results obtained from an auto correlation technique used to
compare input to and output from the networks of Figure 4.

Referring to figure 1, a complete paper manufacturing installation is illustrated in
outline. The installation has a plurality of inputs, that is chemicals (represented by an
arrow 1), fibres (represented by an arrow 2) and water (represented by an arrow 3).

The inputs are all combined by a mixer 4 to form a raw pulp which is fed into a
head box 5. The head box 5 feeds the pulp at a controlled rate onto a moving wire
table 6 upon which the paper is formed. The wire table 6 comprises a number of
vacuum boxes 7 which exert a force on the paper so as to extract from it as much
water as possible. Water removed from paper on the wire table 6 by the vacuum boxes
7 is fed to a white water tank 8.

Water reaching the white water tank 8 can be recycled by input to the mixer unit 4 at a
later stage. It will be appreciated that the white water must be reasonably pure in
order to be effectively recycled. The quantity of chemicals reaching the white water
tank should, where possible, be minimised.

Paper passing over the wire table 6 is kept flat by movement of a further rotatable
wire table 9 which is positioned above the wire table 6. The space between these two
wire tables 6, 9 is adjusted such that paper of desired thickness is manufactured.

The elements of the manufacturing process as described above make up the wet end
process and it is this section of the manufacturing plant that the present invention
seeks to optimise by use of modelling techniques. The remainder of the process,
known as the dry end, will now be described for the sake of completeness.

Once formed on the wire table 6, the paper is passed through a press section
comprising four rollers 10. The rollers 10 exert considerable pressure on the paper so
as to remove as much water as possible by a squeezing action. Having passed through
the press section the paper passes over a number of rollers 11 in a drying section 
where a squeezing action is continued and is enhanced by the effects of heat so as to 
remove water by evaporation. 

The final stage of the process is known as a calender section. Here, the paper is passed
around a number of rollers 12 so as to obtain a paper having desired properties. For
example, the calender section may polish the paper so as to remove any roughness.
Paper leaving the calender section is rolled about a roller 13 which is the end process
of the section. Those skilled in the art will realise that some paper manufacturing
plants do not include a calender section.

As described above, a number of inputs are mixed together to form the raw pulp, 
namely chemicals, water and fibres. The correct balance of these individual 
components is essential in order to obtain a finished product having desired 
properties. The paper plant may use a WIC system in order to obtain information 
about various operating parameters in the wet end of the manufacturing process. One 
known WIC system collects twenty four variables classified into four groups, each 
group having six measurements viz pH, temperature, turbidity, cationic demand, 
alkalinity and conductivity. The various groups are distributed across the wet end of 
the paper manufacture process. The measurement cycle provided is relatively slow, 
taking of the order of 15 minutes to obtain a complete set of measurements. This WIC 
data is combined with other user defined measurements such as wire table speed and 
pulp flow rate, and details of chemical properties of raw materials that are input to the 
process. 

The combined data amounts to some eighty four variables and these variables offer a 
complete overview of the wet end manufacture process. Additionally, these variables 
are directly related to properties of the paper that is output from the manufacturing 
process such as strength and retention of chemical inputs. This is schematically 
illustrated in figure 2 where the three inputs to the process are shown on the left hand 
side of the diagram, the process is schematically represented in the centre and output
characteristics are represented on the right hand side of the diagram. The present
invention provides a model relating these inputs to these outputs.

The model relies upon artificial neural networks (ANNs). Artificial neural networks 
are well known as is their use in modelling to provide an output prediction based on a 
plurality of input variables. The networks are trained using historical data such that 
the output accurately represents the real world response in such circumstances. 

Typically, an ANN has the form:

$$y = \theta^{T}\,\operatorname{tang}\left(\omega^{T} u + B_1\right) + B_2 \tag{1}$$

where:

$y$ is a model output;
$u$ is a vector containing n input variables;
$\theta$, $\omega$, $B_1$, $B_2$ are vectors containing weights for each of the n input
variables, T denoting that the vector has been transposed; and

tang is the standard sigmoid or hyperbolic tangent function.
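
As a rough illustration, a network of this general kind can be evaluated in a few lines of Python. The sketch assumes a single hidden layer with a hyperbolic tangent activation and a single scalar output, that is y = theta^T tanh(omega u + B1) + B2; the function name `ann_output`, the argument layout and the use of a scalar output bias are assumptions made for the example rather than details taken from the text.

```python
import math


def ann_output(u, omega, b1, theta, b2):
    """Evaluate y = theta^T tanh(omega u + b1) + b2 for one scalar output.

    u:     list of n input values
    omega: list of h rows, each holding n hidden-layer weights
    b1:    list of h hidden-layer biases
    theta: list of h output weights
    b2:    scalar output bias (an assumption for this sketch)
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, u)) + bias)
        for row, bias in zip(omega, b1)
    ]
    return sum(t * h for t, h in zip(theta, hidden)) + b2
```

Training then amounts to choosing `omega`, `b1`, `theta` and `b2` so that the output matches historical measurements.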

Usually, a neural network is trained so as to determine optimum weights, before 
inputs are used to predict an output. In a preferred embodiment of the present
invention, ANNs are trained using an approach similar to that described in R. Noreiga
and H. Wang: "A direct adaptive neural network control for unknown non-linear
systems, and its application", IEEE Transactions on Neural Networks, Volume 9, pp 
27-34, 1998. 



It was mentioned above that data collected from various parts of the wet end paper 
manufacturing process amounts to a large number of variables. It is not possible to 
use this quantity of data as input to an ANN and achieve a reasonable result within a
reasonable time. Therefore, techniques are required to reduce the dimension of the
collected data. 

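Input data to a neural network can be considered to comprise n variables each having
values for m time periods. That is a set of data $\{x_1, x_2, \ldots, x_n\}$ which can be
represented by $\Xi$ and is in fact:

$$\Xi = \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{n1} \\ \vdots & \vdots & & \vdots \\ x_{1m} & x_{2m} & \cdots & x_{nm} \end{bmatrix} \tag{2}$$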



The present invention employs principal component analysis techniques to reduce the 
dimension of the collected data. That is, all data is input to a first process which 
reduces the dimension of the collected data. Linear and non-linear principal 
component analysis techniques are well known. In accordance with the present 
invention, the two techniques are combined to achieve the required reduction in data 
dimension. 

The process is illustrated in figure 3. The input data $\Xi$ is passed through a linear
principal component analysis module 14 to generate a plurality of linear principal
components $t_1, \ldots, t_m$ and an error signal E. The generated error signal E is fed into a
non-linear principal component analysis module 15 which contains two ANNs
denoted $f^{-1}$ and $f$. An output from the non-linear principal component analysis module
15 comprises one or more non-linear principal components $t_p, \ldots, t_q$. The outputs of the
linear and non-linear principal component analysis modules form input to a model 16.
The function of the linear and non-linear principal component analysis modules is
described in further detail below.

The linear principal component analysis technique takes the data set $\Xi$ as input, and
outputs a plurality of linear principal components and an error signal. The operation 
of this linear technique is described below. 



Given the input data set $\Xi$ as in equation (2), a matrix A may be formed such that

$$A = \Xi^{T} \Xi \tag{3}$$

where
$\Xi^{T}$ is the transpose of $\Xi$.

Eigenvalues can be computed for A to give a set of values:

$$\lambda_1 > \lambda_2 > \lambda_3 > \cdots > \lambda_n \tag{4}$$

where
$\lambda_1, \ldots, \lambda_n$ are Eigenvalues; and
n is the dimension of A.



The original dataset $\Xi$ can be expressed in terms of a number of components:

$$\Xi = t_1 p_1 + t_2 p_2 + \cdots + t_n p_n + E_0 \tag{5}$$

where
$t_i$ is the principal component related to $\lambda_i$ and is calculated as presented below;
$p_i$ is the transpose of the Eigenvector of the corresponding Eigenvalue $\lambda_i$; and
$E_0$ is an error signal.

Determination of $t_1$ will be presented by way of example. $t_1$ is calculated by taking
equation (5) and multiplying through by $p_1^{T}$ to give:

$$\Xi p_1^{T} = t_1 p_1 p_1^{T} + t_2 p_2 p_1^{T} + \cdots + t_n p_n p_1^{T} + E_0 p_1^{T} \tag{6}$$

given that $t_i p_i p_1^{T} = 0$ ($\forall i : i \neq 1$), and $E_0 p_1^{T} = 0$, equation (6) reduces to:

$$\Xi p_1^{T} = t_1 p_1 p_1^{T} \tag{7}$$

so that:

$$t_1 = \frac{\Xi p_1^{T}}{p_1 p_1^{T}} \tag{8}$$

$t_2, \ldots, t_n$ can be similarly computed.

Selection of principal components from equation (5), by taking the first m terms
reduces the data set to that, for example, of equation (9). The value of m will
determine the accuracy of the simplification. If m=2, then:

$$\Xi = t_1 p_1 + t_2 p_2 + E_1 \tag{9}$$

where $E_1$ is an error signal and all other terms take values as hereinbefore described.
The dimension of $E_1$ will be equal to the dimension of the input data set $\Xi$.
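
The linear stage can be illustrated with a short, standard-library-only Python sketch. It assumes the data set is held as a list of rows (one row per sample, one column per variable) and, purely for illustration, finds each Eigenvector by power iteration with deflation instead of a full Eigenvalue decomposition; that substitution, and the names `matvec`, `leading_eigenvector` and `linear_pca`, are choices made for this example and are not taken from the method described above.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def leading_eigenvector(A, iterations=500):
    """Power iteration: unit Eigenvector for the largest Eigenvalue of A."""
    v = [1.0 / (i + 1) for i in range(len(A))]  # arbitrary non-zero start
    for _ in range(iterations):
        w = matvec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # A is zero: nothing left to extract
        v = [x / norm for x in w]
    return v


def linear_pca(data, n_components):
    """Return (scores, loadings, residual) for data with rows = samples."""
    residual = [row[:] for row in data]
    n = len(data[0])
    scores, loadings = [], []
    for _ in range(n_components):
        # A = R^T R for the current (deflated) data R
        A = [[sum(r[i] * r[j] for r in residual) for j in range(n)]
             for i in range(n)]
        p = leading_eigenvector(A)
        t = matvec(residual, p)  # t = R p^T / (p p^T), with p p^T = 1
        residual = [[x - ti * pj for x, pj in zip(row, p)]
                    for row, ti in zip(residual, t)]
        scores.append(t)
        loadings.append(p)
    return scores, loadings, residual
```

With `n_components=2` the returned `residual` plays the role of the error signal left after removing the first two linear principal components.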

In accordance with the flow diagram of figure 3, the error signal E is fed as input to
the non-linear principal component analysis module 15. The purpose of the non-linear
principal component analysis module 15 is to convert the error signal into a plurality
of non-linear principal components, thereby reducing the dimension of the error signal
$E_1$. This is achieved by considering an equation of the form:

$$E_1 = f(t_3, t_4) \tag{10}$$

where:
$f$ is a non-linear function whose form may be determined using a neural
network; and

$t_3$ and $t_4$ are non-linear principal components whose form is determined using
a neural network.

Determination of $f$, $t_3$ and $t_4$ will now be described. Given $t_1 p_1$ and $t_2 p_2$ from equation
(9), a matrix X may be constructed such that:

$$X = \Xi - t_1 p_1 - t_2 p_2 \tag{11}$$

This matrix X is equivalent to the error signal $E_1$ and is input to the non-linear
principal component analysis module 15. It can be seen from figure 3 that the
non-linear module 15 uses two neural networks. The structure of these two ANNs is
illustrated in figure 4.

Referring to figure 4, a first ANN comprises three layers 17, 18, 19 and a second 
ANN comprises a layer 19 in common with the first ANN together with two further 
layers 20, 21. 



The first ANN is considered to correspond to the function $f^{-1}$, while the second is
considered to correspond to the function $f$, where $f^{-1}$ is the inverse of $f$. The output of
the first ANN $f^{-1}$ is the input to the second ANN $f$ and this output comprises $t_3$, $t_4$. The
relationship is as represented in equation (10).

Since $f^{-1}$ is the inverse of $f$, the input X and output $\hat{X}$ should be equal. Given this
information, and a known form of ANN, the weights of each neuron of the ANN may
be determined so as to calculate $f$ and $t_3$, $t_4$ with sufficient accuracy by means of a
training algorithm such as that referred to above.



The first and second ANNs each comprise one or more layers of neurons. When 
creating suitable ANNs, it is possible to initially use a small number of neurons, and 
to increase this number in both the first and second ANNs concurrently. After each
neuron addition, the output of the second ANN is compared to the input to the first
ANN and the increase in neuron number is continued until the comparison shows that
a difference between values being compared is below a predetermined threshold. Each
of the first and second ANNs has an equal number of neurons. It has been found that
creating ANNs where the central (or hidden) layers 18, 20 of each ANN have twice
as many neurons as the input layer 17 of the first neural network gives satisfactory
results.
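
The sizing rule and the grow-until-accurate loop can be sketched as follows. This is a hedged illustration: `train_and_score` is a hypothetical callable standing in for whatever training and comparison procedure is used, and growing the shared central layer one neuron at a time is an assumption, since the text does not say which layer receives the added neurons.

```python
def autoassociative_layer_sizes(n_inputs, n_bottleneck):
    """Layer sizes for layers 17, 18, 19, 20 and 21 of figure 4."""
    hidden = 2 * n_inputs  # hidden layers 18 and 20: twice the input layer
    return [n_inputs, hidden, n_bottleneck, hidden, n_inputs]


def grow_until_accurate(train_and_score, n_inputs, threshold, max_bottleneck=10):
    """Add neurons until the reconstruction difference falls below threshold.

    train_and_score is a hypothetical callable: it receives the list of layer
    sizes, trains both ANNs, and returns the difference between the input to
    the first ANN and the output of the second.
    """
    for n_bottleneck in range(1, max_bottleneck + 1):
        sizes = autoassociative_layer_sizes(n_inputs, n_bottleneck)
        difference = train_and_score(sizes)
        if difference < threshold:
            return sizes, difference
    raise RuntimeError("threshold not reached within max_bottleneck neurons")
```

For example, with six inputs and a stand-in scorer whose error falls as the central layer grows, the loop stops at the first size that meets the threshold.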

The network of figure 4 is a well known ANN structure generally referred to as a 
feed-forward back propagation network. The central layer 19 of the structure
determines the principal components and thus the number of neurons in this layer 
determines the efficiency of the data reduction provided. 

Configuring the ANN such that the inputs and outputs X, $\hat{X}$ are substantially equal is
carried out using an auto correlation technique. The input data X is a matrix having
equal dimension to the input data applied to the linear principal component analysis
module 14 and is made up of a plurality of row vectors, each row vector containing
temporally spaced values for a particular measurement. Each row represents a
different measurement such that the matrix has as many rows as there are
measurements.

Auto correlation is performed in a conventional way, such that the matrix X and the
matrix $\hat{X}$ are compared. The result of the auto correlation test is
presented in figure 5. The x-axis represents shift, and the y-axis represents the quality
of correlation at that shift. Therefore, figure 5 shows a good correlation at shift zero,
as would be expected. A large central peak (height A), which is reasonably narrow
(width B), and small peaks of maximum height C show that the differences between
X and $\hat{X}$ are small and are mainly made up of independent noise (also known as white
noise), that cannot be modelled. However, the illustration of figure 5 shows that there
is little white noise, denoted by smaller peaks (i.e. correlation at non-zero shift
values), thereby further indicating that a good correlation has been obtained.

The auto correlation process can be conveniently carried out using MATLAB®
(distributed by The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098
United States of America), using an "xcorr" function supplied therein. The correlation
data shown in figure 5 is generated by MATLAB, and only data in a specified auto
correlation area is used in the correlation calculations plotted in figure 5. This area is
defined as that between the second and third maxima in the data set.
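
For readers without MATLAB, the idea of correlating two sequences over a range of shifts can be sketched in plain Python. The function below is a simple stand-in that computes the raw (unnormalised) cross-correlation of two equal-length sequences; it is not a reproduction of the "xcorr" function or its options, and the name `cross_correlation` is hypothetical.

```python
def cross_correlation(x, y):
    """Return (shifts, values) of the raw cross-correlation of x and y.

    values[k] is the sum over i of x[i + shift] * y[i] for each integer
    shift from -(n - 1) to n - 1, where n = len(x) = len(y).
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("sequences must have equal length")
    shifts = list(range(-(n - 1), n))
    values = []
    for shift in shifts:
        total = 0.0
        for i in range(n):
            j = i + shift
            if 0 <= j < n:  # ignore terms that fall outside the sequence
                total += x[j] * y[i]
        values.append(total)
    return shifts, values
```

When the two sequences are nearly identical, the largest value appears at shift zero, which is the behaviour described for figure 5.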

The effectiveness of the correlation illustrated in figure 5 is mathematically 
represented by a factor F given in equation (12): 

$$F = \frac{\mathit{Width} \times \mathit{SecondMaxPeak}}{\mathit{MaxPeak}} \tag{12}$$

where,

F is a Factor of the auto correlation in the specified area;
Width is a distance (along the x axis) between the two lowest values within the
illustrated area;

SecondMaxPeak is the second highest peak within the area; and
MaxPeak is the height of the maximum peak (A in figure 5).

The width of the specified area auto correlation is the distance between the two
smallest values within the plotted area. Width can therefore be computed according to
equation (13):

$$\mathit{Width} = \left| \mathit{POS1} - \mathit{POS2} \right| \tag{13}$$



where:

POS1 is the position (on the x-axis of figure 5) of the minimum value; and
POS2 is the position (on the x-axis of figure 5) of the second lowest value.

The highest peak is calculated by subtracting the minimum value from the maximum value
plotted in the specified area:

$$\mathit{MaxPeak} = \max(\mathit{VAL}) - \min(\mathit{VAL}) \tag{14}$$

where, 

VAL is a vector of all values plotted in figure 5; 

max() is a predefined function taking a vector, and returning the maximum 
value contained within that vector; and 

min() is a predefined function taking a vector, and returning the
minimum value contained within that vector; and

$$\mathit{SecondMaxPeak} = \mathit{VAL}(2) - \min(\mathit{VAL}) \tag{15}$$

where,

VAL(2) is the second highest value of vector VAL; and
min is a predefined function as defined above.
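
Equations (12) to (15) can be combined into one small Python function. This sketch assumes the plotted values and their x-axis positions are supplied as two equal-length lists, and it takes the "second highest value" to be the second-largest sample rather than the second-highest local peak; both the function name `correlation_factor` and that interpretation are assumptions.

```python
def correlation_factor(positions, values):
    """F = Width * SecondMaxPeak / MaxPeak, following equations (12) to (15)."""
    if len(values) < 2 or len(positions) != len(values):
        raise ValueError("need at least two values, one position per value")
    order = sorted(range(len(values)), key=lambda i: values[i])
    lowest, second_lowest = order[0], order[1]
    second_highest, highest = order[-2], order[-1]
    width = abs(positions[lowest] - positions[second_lowest])   # equation (13)
    max_peak = values[highest] - values[lowest]                 # equation (14)
    second_max_peak = values[second_highest] - values[lowest]   # equation (15)
    if max_peak == 0:
        raise ValueError("values are constant, so MaxPeak is zero")
    return width * second_max_peak / max_peak                   # equation (12)
```

A tall central peak with low side peaks gives a small F, which is the condition the training procedure aims for.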

The factor F calculated using equation (12) is a measure of the accuracy of the 
correlation. In a preferred embodiment of the present invention, all values input to the 
module 15 (that is input to layer 17 of figure 4) are normalised (that is take values
between -1 and +1). In such a circumstance it has been found that training the neural
network until F has a value less than 1 provides good results. Further training is likely
to improve the accuracy of results obtained, but it is likely that such training will 
result in relatively modest improvements. 
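
The text does not say how the normalisation to the range -1 to +1 is performed. One common choice, shown here purely as an assumption, is a linear min-max rescaling of each measurement; the function name `normalise_to_unit_range` is hypothetical.

```python
def normalise_to_unit_range(values):
    """Linearly rescale a sequence so its minimum maps to -1 and maximum to +1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant signal: map to mid-range
    return [2.0 * (v - low) / (high - low) - 1.0 for v in values]
```

Applied to each row of the error matrix in turn, this keeps every input to layer 17 within the stated range.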

Having performed both the linear and non-linear principal component analysis
techniques described above, the linear and non-linear principal components $t_1$, $t_2$, $t_3$
and $t_4$ can be used to represent the input dataset. Thus, the dimension of the input data
has been reduced considerably. 



Referring back to figure 3, it can be seen that the non-linear principal components $t_3$
and $t_4$ are taken from the data output from the first ANN $f^{-1}$. These, together with the
linear principal components $t_1$ and $t_2$, are used as input to the model 16. The model 16
is recursive, taking previous output values as input as illustrated in figure 3. This
allows the model to be dynamic in nature, as is explained below.

The model 16 is represented by at least one further ANN having a form as set out in
equation (1). At least one output relating to the input values $t_1$, $t_2$, $t_3$ and $t_4$ is predicted
by the model 16. In use, measurements will be taken and used to generate input into
the model to predict an output. This output should match at least one predetermined
property of the manufactured paper, for example paper strength.

This problem is complicated because generally the model 16 will incorporate a
number of ANNs operating concurrently, each having the ability to optimise
one criterion of the paper machine output. These ANNs must all be optimised
concurrently, so as to obtain input parameters to create output which accurately
reflects all desired properties. Additionally, it is desirable to optimise chemical input
quantities so as to control costs. A method of performing this optimisation will now 
be described in terms of a system where output strength and chemical retention are 
both to be optimised. 



Suitable equations for each of these characteristics are:

(16)
(17)

where
S represents the strength of output paper;
R represents retention of raw materials in the output paper;
g and f are suitably trained ANNs;
k is the current sample;
k-1 is the previous sample;
$t_1, \ldots, t_4$ are as hereinbefore defined;
U(k-d) is a variable related to chemical costs;
d is time delay; and
l is the time period over which previous measurements are to be taken into
account.

Equations (16) and (17) provide a dynamic model by linking values at a time k to 
values at a time k-1. The presence of R and S on the right hand side of the equations
provides a feedback loop, such that previous output values affect the current output 
value. 
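
The feedback loop can be illustrated with a short Python sketch in which each prediction is appended to the history used for the next one. The exact argument list of the trained ANNs is not reproduced here, so the callable `model` and its signature (past outputs, current principal components, delayed chemical variable) are hypothetical stand-ins.

```python
def simulate_recursive(model, initial_outputs, component_series, u_series, delay):
    """Run a recursive model forward, feeding past outputs back as inputs.

    model:            hypothetical callable (past_outputs, components, u) -> output
    initial_outputs:  the most recent measured outputs, oldest first
    component_series: per-sample principal components, e.g. [t1, t2, t3, t4]
    u_series:         per-sample chemical-cost variable U
    delay:            time delay d, so sample k uses U from sample k - d
    """
    window = len(initial_outputs)
    if window == 0:
        raise ValueError("at least one past output is required")
    history = list(initial_outputs)
    predictions = []
    for k in range(delay, len(component_series)):
        past = history[-window:]
        y = model(past, component_series[k], u_series[k - delay])
        predictions.append(y)
        history.append(y)  # feedback: the new output becomes a past output
    return predictions
```

Because each output depends on earlier outputs, a change in U continues to influence predictions for several samples after it is applied.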

It is desired to use the system of the present invention so as to predict output 
properties using currently known values. The equations may be modified to do this 
such that: 

(18) 
(19) 

In order to ensure that equations (18) and (19) accurately reflect measured properties 
of the manufactured paper, modelled and measured values should be compared. This 
can conveniently be done using an auto correlation technique similar to that described 
above. If the measured and modelled values are not sufficiently close, amendments to 
the model are necessary. 

In a paper manufacturing plant, a number of dynamic models having the form of 
equations (18) and (19) are created. It is necessary to combine the models of 
equations (18) and (19) so as to obtain a performance function made up of the two 
models. A number of suitable equations have been developed, and are detailed in 
equations (20) to (22) below.

$$J_1 = \frac{a}{S_{k+d}} + \frac{b}{R_{k+d}} + cU_k^2 \tag{20}$$

$$J_2 = a(S_{k+d} - T)^2 + \frac{b}{R_{k+d}} + cU_k^2 \tag{21}$$

$$J_3 = a(S_{k+d} - T)^2 + b(R_{k+d} - T_r)^2 + cU_k^2 \tag{22}$$

where:

$J_1$, $J_2$, $J_3$ are variables representing performance;
a, b and c are constants; and
T and $T_r$ are target values.



Referring to equation (20) it can be seen that minimising $J_1$ will result in maximising
both $R_{k+d}$ and $S_{k+d}$, while minimising U. As both Strength (S) and Retention (R)
should be maximised, and Chemical Inputs (U) should be minimised, it can be seen
that $J_1$ is effective as a performance function.

Equation (21) is such that its first term $(S_{k+d} - T)^2$ will be a minimum when
$S_{k+d} = T$. Thus, minimising $J_2$ will result in maximising of Retention (R),
minimising of chemical inputs, and focusing Strength (S) on its target value.

Equation (22) provides a performance function which can be minimised to allow both
strength and retention to be focussed upon predetermined targets.

Referring back to equations (18) and (19), it will be seen that each of
$S_k$, $R_k$, $t_{1(k)}$, $t_{2(k)}$, $t_{3(k)}$, $t_{4(k)}$ are known; $t_{1(k)}$, $t_{2(k)}$, $t_{3(k)}$, $t_{4(k)}$ having been obtained as
output from the data reduction method and $S_k$, $R_k$ being obtainable through
measurement. The value of $U_k$ is also known.

The only directly variable parameter is $U_k$. An initial value is obtained by carrying out
data analysis based upon past performance. If this does not yield desired values for J
in any of equations (20), (21) or (22), $U_k$ is varied, until it reaches an optimum value,
which may be determined by:



$$U_{k+1} = U_k - \lambda \frac{\partial J}{\partial U_k} \tag{23}$$

where $\lambda$ is a constant.
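
One way to picture this iterative adjustment is as a gradient-style search on the performance function. The sketch below assumes that the update subtracts a constant multiple of the slope of J with respect to U, and it estimates that slope by a central finite difference; the function name `optimise_input`, the step size and the stopping rule are all assumptions made for illustration.

```python
def optimise_input(J, u0, step=0.1, h=1e-6, tolerance=1e-8, max_iterations=10000):
    """Vary U by repeated gradient-style steps until the change is negligible.

    J is a hypothetical callable mapping a scalar U to the performance value.
    The slope of J is estimated with a central finite difference of width h.
    """
    u = u0
    for _ in range(max_iterations):
        gradient = (J(u + h) - J(u - h)) / (2.0 * h)
        u_next = u - step * gradient
        if abs(u_next - u) < tolerance:
            return u_next
        u = u_next
    return u
```

In practice J would wrap one of the performance functions together with the trained models, so that each evaluation predicts the paper properties for a candidate chemical input.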



It will be appreciated that similar principles may be applied to the creation and 
optimisation of a performance function made up of more than two models, including 
models representing qualities such as breaking frequency, effluent quality, and energy 
consumption. Equation (24) provides an example of such as model: 

J 4 ="(Sk+<i -T) 2 ~T r ) 2 +Bf+Eq + Ec+cU t 2 (24) 

where all terms are as hereinbefore defined and: 
J 4 is a performance function; 

Bf is a term involving a model representing breaking frequency;
Eq is a term involving a model representing Effluent Quality; and
Ec is a term involving a model representing Energy consumption.

CLAIMS



1. A method for converting a quantity of input data to output data of lesser
dimension, comprising:

applying a linear principal component analysis technique to the input data to
generate a plurality of linear principal components and an error signal;

inputting said error signal to a first neural network which outputs at least one
variable;

inputting the said at least one variable to a second neural network; and

configuring the first and second neural networks such that the output of the
second neural network is substantially equal to the input to the first neural network;

the output data being represented by the said plurality of linear principal
components and the said at least one variable.

2. A method according to claim 1, wherein the first and second neural networks 
are trained until the difference between the output of the second neural network and 
the input to the first neural network is less than a predetermined threshold. 

3. A method according to claim 1 or 2, wherein neurons are added to the first and 
second neural networks until the difference between the output of the second neural 
network and the input to the first neural network is less than a predetermined 
threshold. 

4. A method according to any preceding claim, wherein a difference between the 
input to the first neural network and the output of the second neural network is 
assessed using an auto correlation technique. 

5. A method according to any preceding claim, wherein the first and second
neural networks each comprise an input layer having connections with a hidden layer 
which in turn has connections with an output layer, the output layer of the first neural 
network forming the input layer of the second neural network. 

6. A method according to claim 5, wherein the hidden layer of each neural 
network has more neurons than the input layer of the first neural network. 

7. A method according to claim 6, wherein the hidden layer of each neural 
network has twice as many neurons as the input layer of the first neural network. 

8. A method according to claim 5, 6 or 7, wherein the input layer of the first 
neural network has an equal number of neurons to the output layer of the second 
neural network. 

9. A method of dynamically modelling a paper manufacturing plant comprising 
derivation of a function which takes as input a plurality of parameters of said paper 
manufacturing plant which have been reduced in dimension and outputs a value 
indicative of a quality of paper output from the paper manufacturing plant.

10. A method according to claim 9, wherein the plurality of parameters comprise a 
plurality of temporally spaced values for at least one parameter of the paper 
manufacturing plant. 

11. A method according to claim 9 or 10, wherein the function also takes as input 
values previously output from the function. 

12. A method according to claim 9, 10 or 11, wherein the function is provided by
an artificial neural network. 




13. A method according to claim 9, 10, 11 or 12, wherein at least some of the 
plurality of parameters are outputs from a method according to any one of claims 1 to 
8.

14. A method according to any one of claims 9 to 13, wherein the model has a 
general form: 

R_{k+d} = g(S_k, S_{k-1}, S_{k-2}, ..., S_{k-l}, R_k, R_{k-1}, R_{k-2}, ..., R_{k-l}, t_{1(k)}, t_{2(k)}, t_{3(k)}, t_{4(k)}, U_{(k)})

where: R represents a quality of the output paper;

k is a current sample;

k-1 is a previous sample;

t_1, ..., t_4 are parameters of the paper manufacturing plant;

U_{(k)} is a variable related to chemical costs;

d is time delay; and

l is the time period over which previous measurements are to be taken into
account.

15. A method wherein the outputs of a plurality of models generated using the 
method of any one of claims 9 to 14 are combined to generate a performance function. 

16. A method according to claim 15, wherein optimisation of the performance 
function seeks to maximise at least one of the outputs of the plurality of models. 

17. A method according to claim 15, wherein optimisation of the performance 
function seeks to focus at least one of the outputs of the plurality of models upon a 
predetermined target. 

18. A method according to claim 15 wherein the performance function has the 
form: 

J = a/A + b/B + C

where: J is an output from the function; 

A and B are outputs from models generated using a method according to any
one of claims 1 to 8; 

C is a variable representing all adjustable inputs that are to be minimised; and 
a and b are constants. 

19. A method according to claim 15, wherein the performance function has the 
form: 

J = a(A - T)^2 + b/B + C

where: J is an output from the function; 

A and B are outputs from models generated using a method according to any 
one of claims 1 to 8; 

C is a variable representing all adjustable inputs that are to be minimised; 

T is a target upon which A is to be focussed; and 

a and b are constants. 

20. A method according to claim 15, wherein the performance function has the 
form: 

J = a(A - T)^2 + b(B - T_2)^2 + C

where: J is an output from the function; 

A and B are outputs from models generated using a method according to any 
one of claims 1 to 10; 

T and T_2 are target values for A and B respectively; and

a and b are constants. 

21. A method of optimising operation of a paper manufacturing plant, comprising 
maximising a performance function generated using a method according to any one of 
claims 15 to 20. 



22. A computer program for carrying out the method of any preceding claim.

23. A data carrier medium carrying computer program code means to cause a
computer to execute procedure in accordance with the method of any one of claims 1 
to 21. 

24. An apparatus for carrying out a method according to any one of claims 1 to 21.

25. A method substantially as hereinbefore described with reference to figures 3 to 
5 of the accompanying drawings. 

26. An apparatus for carrying out a method substantially as hereinbefore described 
with reference to figures 3 to 5 of the accompanying drawings. 

27. A computer program for carrying out a method substantially as hereinbefore 
described, with reference to figures 3 to 5 of the accompanying drawings. 